Reducing computational load in segmental hidden Markov model decoding for speech recognition

Author

M. J. Russell

Abstract

Introduction: Research into segment models (SMs) for automatic speech recognition is motivated by limitations of conventional hidden Markov models (HMMs). While HMMs associate states with individual feature vectors, SMs associate states with sequences of vectors (segments) [1], or variable duration acoustic features [2], thereby allowing important static and dynamic structure to be modelled. Glass [2] reports state-of-the-art phone recognition on TIMIT [3] using an SM. Segmental HMMs (SHMMs) can outperform comparable HMMs [4], but computational load increases. In standard notation, the basic step in HMM Viterbi decoding for a sequence of vectors $y_1, \ldots, y_t, \ldots, y_T$ is:

$$\alpha_t(i) = \max_j \alpha_{t-1}(j)\, a_{ji}\, b_i(y_t) \qquad (1)$$
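As a rough illustration of the recursion in equation (1), the following is a minimal pure-Python sketch of the standard HMM Viterbi step only, not of the segmental decoder the paper goes on to discuss. The function name `viterbi_scores`, the initial-score vector `init`, the initialisation at the first frame, and the toy numbers are all assumptions made for this example; none of them appear in the abstract.

```python
def viterbi_scores(init, trans, emit):
    """Return alpha[t][i], the score of the best partial path ending in state i.

    init[i]     : assumed initial score for state i (not given in the abstract)
    trans[j][i] : a_ji, probability of moving from state j to state i
    emit[t][i]  : b_i(y_t), likelihood of vector y_t under state i
    """
    n_states = len(init)
    # First frame: assumed initialisation, alpha_1(i) = init_i * b_i(y_1).
    alpha = [[init[i] * emit[0][i] for i in range(n_states)]]
    for t in range(1, len(emit)):
        prev = alpha[-1]
        # Equation (1): alpha_t(i) = max_j alpha_{t-1}(j) * a_ji * b_i(y_t)
        row = [
            max(prev[j] * trans[j][i] for j in range(n_states)) * emit[t][i]
            for i in range(n_states)
        ]
        alpha.append(row)
    return alpha


# Toy two-state, three-frame example with made-up numbers.
init = [1.0, 0.0]
trans = [[0.5, 0.5],
         [0.0, 1.0]]
emit = [[0.5, 0.5],
        [0.5, 0.25],
        [0.25, 0.5]]
alpha = viterbi_scores(init, trans, emit)
print(alpha[-1])
```

For N states, each frame in this sketch needs on the order of N × N multiply-and-compare operations. A practical decoder would normally work with log probabilities to avoid numerical underflow on long sequences; plain probabilities are used here only to keep the correspondence with equation (1) visible.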


Similar resources

Performing Speech Recognition on Multiple Parallel Files Using Continuous Hidden Markov Models on an FPGA

Speech recognition is a computationally demanding task, particularly the stages which use Viterbi decoding for converting pre-processed speech data into words or subword units, and the associated observation probability calculations, which employ multivariate Gaussian distributions; so any device that can reduce the load on, for example, a PC's processor, is advantageous. Hence we present ...


Telephone Speech Recognition via the Combination of Knowledge Sources in a Segmental Speech Model

The currently dominant speech recognition methodology, Hidden Markov Modeling, treats speech as a stochastic random process with very simple mathematical properties. The simplistic assumptions of the model, and especially that of the independence of the observation vectors have been criticized by many in the literature, and alternative solutions have been proposed. One such alternative is segme...


Ginisupport vector machines for segmental minimum Bayes risk decoding of continuous speech

We describe the use of Support Vector Machines (SVMs) for continuous speech recognition by incorporating them in Segmental Minimum Bayes Risk decoding. Lattice cutting is used to convert the Automatic Speech Recognition search space into sequences of smaller recognition problems. SVMs are then trained as discriminative models over each of these problems and used in a rescoring framework. We pos...


Efficient Methods for Automatic Speech Recognition

This thesis presents work in the area of automatic speech recognition (ASR). The thesis focuses on methods for increasing the efficiency of speech recognition systems and on techniques for efficient representation of different types of knowledge in the decoding process. In this work, several decoding algorithms and recognition systems have been developed, aimed at various recognition tasks. The...


Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...




Publication date: 2000
